Clustering Gene Expression Data Using Graph Separators
Authors
Abstract
Recent work has used graphs to model expression data from microarray experiments, with a view to partitioning the genes into clusters. In this paper, we introduce the use of a decomposition by clique separators. Our aim is to improve on classical clustering methods in two ways: first, we want to allow overlap between clusters, as this seems biologically sound; second, we want the structure of the graph to guide the choice of the number of clusters. We test this approach on a well-known yeast (Saccharomyces cerevisiae) database. The results are good, in that the expression profiles of the clusters we find are very coherent. Moreover, we are able to organize the clusters we find into another graph and order them in a way that turns out to respect the chronological order defined by the sporulation process.
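As a rough illustration of why a clique-separator decomposition yields overlapping clusters, the sketch below performs a single decomposition step on a toy graph. It is not the authors' implementation: the gene names, the edge list, and the helper functions (`is_clique`, `components`, `split_on_clique_separator`) are all hypothetical, and a full decomposition would repeat this step on each part until no clique separator remains.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """Return True if every pair of the given nodes is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def components(adj, removed):
    """Connected components of the graph after deleting the `removed` nodes."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def split_on_clique_separator(adj, separator):
    """One decomposition step: the separator is copied into every part."""
    separator = set(separator)
    if not is_clique(adj, separator):
        raise ValueError("separator is not a clique")
    comps = components(adj, separator)
    if len(comps) < 2:
        raise ValueError("removing these nodes does not disconnect the graph")
    return [comp | separator for comp in comps]


# Hypothetical toy graph: an edge means two genes have similar profiles.
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g4"), ("g3", "g4"),
         ("g3", "g5"), ("g4", "g6"), ("g5", "g6")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# {g3, g4} is a clique (a single edge) whose removal disconnects the graph.
parts = sorted(sorted(p) for p in split_on_clique_separator(adj, {"g3", "g4"}))
print(parts)
```

In this toy case the genes g3 and g4 end up in both parts, which is the kind of overlap between clusters the abstract refers to.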
Similar resources
Impact of the distance choice on clustering gene expression data using graph decompositions
The study of gene interactions is an important research area in biology. Nowadays, high-throughput techniques are available to obtain gene expression data, and grouping genes with similar expression profiles into clusters is a first mandatory step towards a better understanding of the functional relationships between genes. In Kaba et al. [7], a new clustering approach was presented, using gene inter...
Recursive clustering for graph-based gene expression data
A recent trend in research uses graphs to model experimental microarray data. Recently, we used graph separators to group a set of 40 genes from a yeast database of Saccharomyces cerevisiae into very coherent clusters. Here, we extend our investigation to all 518 genes of S. cerevisiae which have reacted during the sporulation process. We propose a recursive decomposition into coherent clust...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Clustering gene expression data using random forest dissimilarity
Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) in comparison with the number of samples (patients). Many clustering methods have been built on the dissimilarity among observations, which is calculated by a distance function. As increa...
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
Journal:
- In silico biology
Volume 7, Issue 4-5
Pages: -
Publication date: 2007